
DEFCON Differential Privacy Training Launch

Tuesday, August 18, 2020

Differential privacy is a technique that enables organizations to learn from the majority of their data while ensuring that the results do not allow any individual’s data to be distinguished or re-identified. A popular way of attaining differential privacy is to add carefully calibrated noise to aggregate results, which provides mathematical bounds on the amount of information that can leak. Our open source offering aims to help developers implement differential privacy.
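
As a concrete illustration, here is a minimal, from-scratch sketch of the Laplace mechanism, the classic way to add such noise. It is an illustration only, not the library’s implementation: a count changes by at most 1 when any one person’s data is added or removed (sensitivity 1), so Laplace noise with scale 1/ε yields an ε-differentially-private count.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

// laplaceNoise draws a sample from the Laplace distribution with the given
// scale, via inverse transform sampling. (We ignore the measure-zero edge
// case u == -0.5 for simplicity.)
func laplaceNoise(scale float64) float64 {
	u := rand.Float64() - 0.5 // uniform in [-0.5, 0.5)
	return -scale * math.Copysign(math.Log(1-2*math.Abs(u)), u)
}

// dpCount returns an epsilon-differentially-private count. A count has
// sensitivity 1, so Laplace noise with scale 1/epsilon suffices.
func dpCount(trueCount int, epsilon float64) float64 {
	return float64(trueCount) + laplaceNoise(1.0 / epsilon)
}

func main() {
	fmt.Printf("noisy count: %.1f\n", dpCount(1000, 1.0)) // e.g. 999.3
}
```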

In the summer of 2019, we publicly launched our Differential Privacy Library. Since then, we’ve expanded it from just C++ to also include Go and Java.

We’ve come to realize that implementing differential privacy effectively requires more than a library. As we mentioned in a post earlier this summer, we want all developers to be able to work with differential privacy, and that calls not just for open source code but also for training that shares the underlying knowledge.

Our goal with this training is to give a head start to anyone considering a differential privacy implementation. We also want the material to be clear and useful to anyone working in privacy and security, whether they are a beginner or already have background knowledge in privacy.

This new training contains several steps and covers many topics, such as:
  • The foundations of differential privacy
  • Explanations of why aggregation by itself may not guard against privacy risks
  • The mathematics behind calibrating noise
  • Tools that can be used in conjunction with differential privacy
  • Codelabs that users can take (in Go)
  • Additional resources to address any further questions

Step 1: Take our survey! It only takes five minutes!

This survey gives us insight into what you expect from this training: what your objectives and goals are, and whether you have any prior experience with differential privacy.

Step 2: Check out an introductory video on differential privacy!

We introduce topics like data aggregation, k-anonymity, differential privacy, noise, and others. The goal of this module is to introduce the foundations behind differential privacy and why it is an important and useful privacy tool.
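
To make the aggregation pitfall concrete, here is a small hypothetical sketch of a differencing attack, in which two innocent-looking aggregate queries combine to pinpoint one person’s value. The names and numbers are invented for illustration.

```go
package main

import "fmt"

func sum(xs []int) int {
	t := 0
	for _, x := range xs {
		t += x
	}
	return t
}

func main() {
	// The attacker knows Alice just joined the department and can run
	// aggregate-only salary queries before and after she joined.
	before := []int{52000, 61000, 58000}
	after := []int{52000, 61000, 58000, 75000}

	// Each query returns only a sum, yet their difference reveals
	// Alice's exact salary. Calibrated noise blocks this attack.
	fmt.Println("Alice's salary:", sum(after)-sum(before)) // 75000
}
```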

Step 3: Try out our codelabs

We have provided codelabs in Go to help you practice implementing the differential privacy library end-to-end.

Step 4: Learn more about differential privacy

We want to offer an additional resource to help answer any questions you may have. If you have other resources that you find, please let us know and we will add these links to our overall training.

Step 5: Provide us with some feedback

Please use this survey as a platform to share your experience with this pilot. Did the content meet your expectations? Did it make sense? What was missing? This is the time for you to share your point of view and any pain points you experienced (as well as any positive aspects you encountered).

We hope this training provides an impactful experience for everyone, from beginner coders to privacy specialists. The public differential privacy training will launch at the Stanford Biodesign: “Building for Digital Health” Buildathon, Sept 11-13, 2020, led by Stanford and supported by Google Cloud and Apple Health engineers.

Please continue to share your experiences with us at differential-privacy-feedback@google.com. The suggestions we receive will help us improve and will inform our thinking as we add new features and updates.

Acknowledgements: Miguel Guevara, Bryant Gipson, Royce Wilson, Kate Frankenberg, Katie Holzheimer, Lior Gottleib, Carmen Bush

By Aditi Joshi – Security and Privacy Engineering, Google Cloud

Expanding our Differential Privacy Library

Wednesday, June 24, 2020

All developers have a responsibility to treat data with care and respect. Differential privacy helps organizations derive insights from data while simultaneously ensuring that those results do not allow any individual's data to be distinguished or re-identified. This principled approach supports data computation and analysis across many of Google’s core products and features.

Last summer, Google open sourced our foundational differential privacy library so developers and organizations around the world can benefit from this technology. Today, we’re announcing Go and Java versions of that library, an end-to-end differential privacy solution called Privacy on Beam, and new tools to help developers implement this technology effectively.

We’ve listened to feedback from our developer community and, as of today, developers can now perform differentially private analysis in Java and Go. We’re working to bring these two libraries to full feature parity with C++.

We want all developers to have access to differential privacy, regardless of their level of expertise. Our new Privacy on Beam framework distills years of Google developer experience and efficiency improvements into a comprehensive, easy-to-use solution that handles computation end-to-end. Built on Apache Beam, Privacy on Beam reduces implementation mistakes and takes care of all the steps that are essential to differential privacy, including noise addition, partition selection, and contribution bounding. If you’re new to Apache Beam or differential privacy, our codelab can get you started.
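
For a flavor of what this looks like, here is a condensed sketch of a Privacy on Beam pipeline that privately counts restaurant visits per hour, loosely following the codelab. The data is invented, and the import paths and signatures reflect the API at the time of writing and may have changed in later releases.

```go
package main

import (
	"context"
	"log"
	"reflect"

	"github.com/apache/beam/sdks/go/pkg/beam"
	"github.com/apache/beam/sdks/go/pkg/beam/x/beamx"
	"github.com/apache/beam/sdks/go/pkg/beam/x/debug"
	"github.com/google/differential-privacy/privacy-on-beam/pbeam"
)

// Visit is one row of the (hypothetical) input data.
type Visit struct {
	VisitorID string
	Hour      int
}

func extractHour(v Visit) int { return v.Hour }

func init() {
	beam.RegisterType(reflect.TypeOf(Visit{}))
	beam.RegisterFunction(extractHour)
}

func main() {
	beam.Init()
	p, s := beam.NewPipelineWithRoot()

	visits := beam.CreateList(s, []Visit{
		{"alice", 9}, {"bob", 9}, {"carol", 10},
	})

	// One PrivacySpec holds the privacy budget for the whole pipeline.
	spec := pbeam.NewPrivacySpec(1.0, 1e-5) // epsilon, delta

	// Tie each record to its user so contribution bounding can be
	// enforced per visitor.
	pcol := pbeam.MakePrivateFromStruct(s, visits, spec, "VisitorID")
	hours := pbeam.ParDo(s, extractHour, pcol)

	// Noise addition and partition selection happen inside pbeam.Count.
	counts := pbeam.Count(s, hours, pbeam.CountParams{
		MaxPartitionsContributed: 1, // a visitor is counted toward at most one hour
		MaxValue:                 1, // at most one visit per visitor per hour
	})

	debug.Print(s, counts)
	if err := beamx.Run(context.Background(), p); err != nil {
		log.Fatalf("pipeline failed: %v", err)
	}
}
```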

Tracking privacy budgets is another challenge developers face when implementing differential privacy, so we’re also releasing a new Privacy Loss Distribution tool for this task. With it, developers can maintain an accurate estimate of the total cost to user privacy across collections of differentially private queries, and better evaluate the overall impact of their pipelines. Privacy Loss Distribution supports widely used mechanisms (such as Laplace, Gaussian, and randomized response) and can scale to hundreds of compositions.
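
To see why careful budget accounting matters, compare the classic composition bounds. Under basic composition, k queries that are each ε-differentially private cost k·ε in total; the advanced composition theorem (Dwork, Rothblum, and Vadhan) already certifies much less for small ε, and Privacy Loss Distributions can be tighter still. A back-of-the-envelope sketch:

```go
package main

import (
	"fmt"
	"math"
)

// basicComposition: k mechanisms, each epsilon-DP, compose to (k*epsilon)-DP.
func basicComposition(epsilon float64, k int) float64 {
	return float64(k) * epsilon
}

// advancedComposition: the k-fold composition is (epsPrime, k*delta+deltaPrime)-DP
// with epsPrime = sqrt(2k*ln(1/deltaPrime))*eps + k*eps*(e^eps - 1).
func advancedComposition(epsilon, deltaPrime float64, k int) float64 {
	kf := float64(k)
	return math.Sqrt(2*kf*math.Log(1/deltaPrime))*epsilon +
		kf*epsilon*(math.Exp(epsilon)-1)
}

func main() {
	eps, k := 0.1, 100
	fmt.Printf("basic:    total epsilon = %.2f\n", basicComposition(eps, k))          // 10.00
	fmt.Printf("advanced: total epsilon = %.2f\n", advancedComposition(eps, 1e-6, k)) // ~6.31
}
```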

We hope these new languages, tools, and features unlock differential privacy for even more developers. Continue to share your stories and suggestions with us at dp-open-source@google.com—your feedback will help inform our future differential privacy launches and updates.

Acknowledgements

Software Engineers: Yurii Sushko, Daniel Simmons-Marengo, Christoph Dibak, Damien Desfontaines, Maria Telyatnikova, Dennis Kraft, Jimmy Ross, Vadym Doroshenko
Research Scientists: Pasin Manurangsi, Ravi Kumar, Sergei Vassilvitskii, Alex Kulesza, Jenny Gillenwater, Kareem Amin

By: Miguel Guevara, Mirac Vuslat Basaran, Sasha Kulankhina, and Badih Ghazi – Google Privacy Team and Google Research

Enabling Developers and Organizations to Use Differential Privacy

Friday, September 6, 2019

Originally posted on the Google Developers Blog
By: Miguel Guevara, Product Manager, Privacy and Data Protection Office


Whether you're a city planner, a small business owner, or a software developer, gaining useful insights from data can help make services work better and answer important questions. But, without strong privacy protections, you risk losing the trust of your citizens, customers, and users.

Differentially private data analysis is a principled approach that enables organizations to learn from the majority of their data while simultaneously ensuring that those results do not allow any individual's data to be distinguished or re-identified. This type of analysis can be implemented in a wide variety of ways and for many different purposes. For example, if you are a health researcher, you may want to compare the average amount of time patients remain admitted across various hospitals in order to determine if there are differences in care. Differential privacy is a high-assurance, analytic means of ensuring that use cases like this are addressed in a privacy-preserving manner.
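
As a sketch of how the hospital example could be made differentially private (illustrative only, and not how the library implements it): clamp each patient's stay to a bounded range, then release a noisy sum divided by a noisy count, splitting the privacy budget between the two queries.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

// laplace draws Laplace noise with the given scale (inverse transform).
func laplace(scale float64) float64 {
	u := rand.Float64() - 0.5
	return -scale * math.Copysign(math.Log(1-2*math.Abs(u)), u)
}

// dpMean estimates the mean of values, each clamped to [lo, hi], with total
// privacy cost epsilon, split evenly between a noisy sum and a noisy count.
func dpMean(values []float64, lo, hi, epsilon float64) float64 {
	half := epsilon / 2
	sum := 0.0
	for _, v := range values {
		sum += math.Min(math.Max(v, lo), hi) // bound each contribution
	}
	// Adding or removing one patient changes the clamped sum by at most
	// max(|lo|, |hi|) and the count by at most 1.
	noisySum := sum + laplace(math.Max(math.Abs(lo), math.Abs(hi))/half)
	noisyCount := float64(len(values)) + laplace(1/half)
	if noisyCount < 1 {
		noisyCount = 1 // guard against division by (near) zero on tiny datasets
	}
	return noisySum / noisyCount
}

func main() {
	stays := []float64{2, 5, 3, 11, 4, 6, 2, 8} // days admitted (hypothetical)
	fmt.Printf("DP mean stay: %.2f days\n", dpMean(stays, 0, 30, 1.0))
}
```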

Today, we’re rolling out the open-source version of the differential privacy library that helps power some of Google’s core products. To make the library easy for developers to use, we’re focusing on features that can be particularly difficult to execute from scratch, like automatically calculating bounds on user contributions. It is now freely available to any organization or developer that wants to use it.

A deeper look at the technology

Our open source library was designed to meet the needs of developers. In addition to making it freely accessible, we wanted it to be easy to deploy and useful.

Here are some of the key features of the library:
  • Statistical functions: Most common data science operations are supported by this release. Developers can compute counts, sums, averages, medians, and percentiles using our library (a sketch of one way to privatize a median follows this list).
  • Rigorous testing: Getting differential privacy right is challenging. Besides an extensive test suite, we’ve included an extensible ‘Stochastic Differential Privacy Model Checker library’ to help prevent mistakes.
  • Ready to use: The real utility of an open-source release is in answering the question “Can I use this?” That’s why we’ve included a PostgreSQL extension along with common recipes to get you started. We’ve described the details of our approach in a technical paper that we’ve just released today.
  • Modular: We designed the library so that it can be extended to include other functionalities such as additional mechanisms, aggregation functions, or privacy budget management.
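
Counts and sums can be privatized by noise addition, as sketched earlier, but order statistics like medians need different machinery. Below is a sketch of one standard approach, the exponential mechanism over a grid of candidate values. It illustrates the general technique and is not necessarily the algorithm this library uses.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

// dpMedian picks a candidate median from [lo, hi] using the exponential
// mechanism. The utility of a candidate c is -|#below - #above|, which
// changes by at most 1 when one record is added or removed, so the
// mechanism samples c with probability proportional to exp(eps*u/2).
func dpMedian(data []float64, lo, hi, epsilon float64, numCandidates int) float64 {
	step := (hi - lo) / float64(numCandidates-1)
	weights := make([]float64, numCandidates)
	total := 0.0
	for i := range weights {
		c := lo + float64(i)*step
		below, above := 0, 0
		for _, x := range data {
			if x < c {
				below++
			} else {
				above++
			}
		}
		u := -math.Abs(float64(below - above))
		weights[i] = math.Exp(epsilon * u / 2)
		total += weights[i]
	}
	// Sample a candidate with probability proportional to its weight.
	r := rand.Float64() * total
	for i, w := range weights {
		r -= w
		if r <= 0 {
			return lo + float64(i)*step
		}
	}
	return hi
}

func main() {
	data := []float64{2, 3, 5, 7, 8, 9, 12, 20}
	fmt.Printf("DP median: %.1f\n", dpMedian(data, 0, 30, 1.0, 61))
}
```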

Investing in new privacy technologies

We have driven the research and development of practical, differentially private techniques since we released RAPPOR to help improve Chrome in 2014, and we continue to spearhead their real-world application.

We’ve used differentially private methods to create helpful features in our products, like showing in Google Maps how busy a business is over the course of a day or how popular a particular restaurant’s dish is, and to improve Google Fi.


This year, we’ve announced several open-source privacy technologies—TensorFlow Privacy, TensorFlow Federated, Private Join and Compute—and today’s launch adds to this growing list. We’re excited to make this library broadly available and hope developers will consider leveraging it as they build out their comprehensive data privacy strategies. From medicine, to government, to business, and beyond, it’s our hope that these open-source tools will help produce insights that benefit everyone.

Acknowledgements
Software Engineers: Alain Forget, Bryant Gipson, Celia Zhang, Damien Desfontaines, Daniel Simmons-Marengo, Ian Pudney, Jin Fu, Michael Daub, Priyanka Sehgal, Royce Wilson, William Lam